Concepedia

Concept

corpus linguistics

Overview

Definition of Corpus Linguistics

Corpus linguistics is an empirical method for studying language through the analysis of text corpora, which are balanced collections of authentic spoken or written texts representing specific linguistic varieties.[1.1] These corpora are typically machine-readable and can encompass a wide range of languages, varieties, and modes. Examples include the International Corpus of English and the British National Corpus, a collection of 100 million words created in the 1990s.[1.1] The methodology involves both quantitative and qualitative analyses of language use, drawing on large, electronically available collections of naturally occurring texts.[2.1] Corpus linguistics studies language on the basis of extensive collections of "real life" language use stored in corpora, which are computerized databases created specifically for linguistic research.[3.1] These principled collections of language examples allow researchers to explore and describe linguistic characteristics and patterns across different contexts, such as informal conversations, formal speeches, and various forms of writing.[4.1] The research method, which was established in the 1960s, draws on authentic language examples to support a deeper understanding of language structure and use.[5.1]

Importance in Linguistic Research

Corpus linguistics is essential in linguistic research, providing a systematic approach to studying language through the analysis of large, principled collections of naturally occurring texts. A corpus, defined as a collection of language examples selected and compiled in a principled manner, enables researchers to explore and describe linguistic characteristics and patterns across various contexts, such as informal conversations, formal speeches, and written communications.[7.1] This approach is particularly relevant for understanding language variation, as it allows differences across social groups and regions to be examined.[6.1] The growing connections between sociolinguistics and corpus linguistics show how corpora and corpus linguistic methodologies can be used to investigate core sociolinguistic phenomena.[8.1] In sociolinguistic research, corpus linguistics provides a framework for examining language variation across different social groups, enhancing our understanding of language within social contexts.[9.1] By integrating sociolinguistics with corpus research, scholars can explore the interplay between language and social factors.

Corpus linguistics also has practical applications in language education, offering opportunities to revolutionize pedagogical approaches by catering to diverse learner needs at different proficiency levels.[14.1] This adaptability is crucial for creating effective language curricula that reflect authentic language use.

The representativeness of a corpus determines whether findings can be generalized. A corpus is considered representative if findings based on its contents can be generalized to the language variety it is meant to represent.[19.1] Achieving representativeness involves defining the scope of the corpus, addressing domain considerations, and maintaining balance throughout data collection and annotation.[20.1][15.1][16.1]

The integration of artificial intelligence (AI) into corpus linguistics marks a significant development in contemporary linguistic research. AI, which encompasses machine learning and natural language processing, extends the capabilities of corpus linguists through software tools for corpus analysis, such as AntConc, and through support for multimodal corpus development.[22.1][23.1] The intersection of corpus linguistics with computational linguistics and machine learning presents opportunities for these disciplines to advance linguistic research.[24.1] As AI evolves, it is increasingly seen as a valuable asset that facilitates the work of corpus linguists and reduces the need for significant manual effort or specialized expertise.[21.1]

History

Early Development and Adoption

The early development of corpus linguistics is tied to the establishment of a research method that uses authentic language examples, systematically collected and organized into 'corpora', or searchable 'bodies' of data. The method was established in the 1960s and marked a pivotal moment in the field of linguistics.[5.1] By employing large, principled collections of naturally occurring language, corpus linguistics enables researchers to explore and describe linguistic characteristics and patterns across various contexts, such as informal conversations, formal speeches, and written communications.[48.1] The modern field is characterized by the computer-aided analysis of extensive text databases, which supports techniques such as corpus annotation and the study of collocations.[51.1] The methodology is rooted in the empirical study of language through text corpora, which are balanced and often stratified collections of authentic texts representing specific linguistic varieties. The British National Corpus, created in the 1990s, exemplifies this approach: it is a 100 million word collection of diverse spoken and written texts developed by a consortium of publishers, universities, and the British Library.

The history of corpus linguistics reveals a close relationship with technology, as technological advancements have spurred new forms of linguistic analysis.[51.1] Karl Pearson is recognized as a leading figure in 20th-century statistics, having laid the groundwork for statistical analyses widely used across disciplines, including corpus linguistics.[52.1] He and his collaborators developed the core theory, methods, and language of frequentist statistics, which became the prevalent approach in contemporary science.[55.1] In the early 2000s, statistical methods in corpus linguistics developed further with the introduction of collostructional analysis, which applies the corpus-linguistic concept of collocational analysis to associations between constructions and lexical elements.[53.1] This development illustrates the ongoing influence of Pearson's foundational statistical techniques, which have been adapted to address modern analytical needs in linguistic research.[52.1]
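As a rough illustration of how a Pearson-style statistic can measure the association between a lexical item and a construction, the sketch below computes the chi-squared statistic for a 2x2 contingency table. It is a minimal example under stated assumptions: the function name `chi_squared_2x2` and all counts are invented for illustration, and published collostructional studies often rely on other association measures, such as the Fisher exact test, rather than this one.

```python
def chi_squared_2x2(a, b, c, d):
    """Pearson's chi-squared statistic for a 2x2 contingency table.

    a: occurrences of the word inside the construction
    b: occurrences of the word outside the construction
    c: occurrences of other words inside the construction
    d: occurrences of other words outside the construction
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    denom = row1 * row2 * col1 * col2
    if denom == 0:
        raise ValueError("every row and column total must be non-zero")
    # Shortcut formula, equivalent to summing (observed - expected)^2 / expected
    # over the four cells of the table.
    return n * (a * d - b * c) ** 2 / denom


if __name__ == "__main__":
    # Hypothetical counts, not taken from any real corpus.
    score = chi_squared_2x2(a=30, b=70, c=170, d=9730)
    print(f"chi-squared = {score:.2f}")
```

A larger value indicates a stronger departure from independence between the word and the construction. The statistic alone does not show the direction of the association, so observed and expected counts still need to be compared.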

The Corpus Revolution in the 1990s

The 1990s marked a significant turning point in corpus linguistics, often referred to as the "corpus revolution." New methodologies developed and applied in this period fundamentally transformed linguistic research, driven by advances in computational power, the availability of large corpora, and the emergence of sophisticated statistical methods.

The evolution of Natural Language Processing (NLP) and computational linguistics has likewise been marked by a transition from traditional grammar-based approaches to machine learning methods. Early NLP systems relied primarily on rule-based methods for language parsing and processing, but these systems were limited in handling the nuanced and complex nature of natural language.[59.1] The later introduction of transformer-based models marked a pivotal shift, enabling more effective analysis of language data and narrowing the gap between human language and machine understanding.[60.1] This advancement has implications for future research in corpus linguistics, as it allows for richer insights into discourse patterns and linguistic structures. More recently, the rise of large language models (LLMs) and other AI technologies has further extended the capabilities of corpus linguistics. These tools enable researchers to conduct more nuanced analyses of complex and 'noisy' data, such as that found in social media corpora, thereby broadening the scope of linguistic inquiry.[71.1] Software tools such as AntConc and Coquery, which appeared in the decades after the 1990s, give linguists powerful resources for corpus analysis, including wordlists and concordancing.[72.1]

The corpus revolution has significantly influenced contemporary linguistic research methodologies, leading to a series of paradigm shifts across various subfields of applied linguistics and beyond.[68.1] Contemporary corpus linguists use a wide variety of approaches to analyze language data and study discourse patterns.[65.1] The ongoing expansion of corpus sizes, coupled with advances in computational power and statistical literacy, has further diversified these methodologies, calling for a corresponding intensification of research practices.[66.1] The application of corpus methods has also extended beyond traditional linguistic disciplines, highlighting the methodological tensions in what Hunston (2022) refers to as outward-facing corpus studies.[67.1] This shift underscores the importance of empirical corpus data in reshaping our understanding of language and its applications in various contexts.[68.1]
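Word lists and concordances, two of the functions mentioned above, can be sketched in a few lines. The example below is a simplified illustration and not how AntConc or Coquery are implemented. The function names (`tokenize`, `word_list`, `concordance`), the regular-expression tokenizer, and the sample sentence are all assumptions made for the sketch.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def word_list(tokens):
    """Return (word, frequency) pairs sorted from most to least frequent."""
    return Counter(tokens).most_common()


def concordance(tokens, node, window=3):
    """Return keyword-in-context triples (left context, node word, right context)."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


if __name__ == "__main__":
    # Invented sample text, standing in for a real corpus file.
    tokens = tokenize("The cat sat. The dog sat on the cat.")
    print(word_list(tokens)[:3])
    for left, node, right in concordance(tokens, "sat", window=2):
        print(f"{left:>12} [{node}] {right}")
```

A real corpus tool would add proper tokenization, sorting of concordance lines, and frequency normalization across corpora of different sizes. The sketch only shows the underlying idea of counting tokens and displaying each hit with its surrounding context.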

Recent Advancements

Technological Innovations in Corpus Linguistics

Recent technological innovations have profoundly affected corpus linguistics, particularly through advances in machine learning and natural language processing (NLP). These developments have made corpus creation and analysis more efficient by enabling researchers to manage larger datasets with greater precision. Computational methods from NLP have notably improved text classification and similarity modeling across extensive corpora, offering new insights into linguistic patterns and structures.[98.1] The introduction of large language models (LLMs) has further transformed corpus annotation. These machine learning models use deep neural networks to process vast amounts of text data, reducing the time required for complex annotation tasks by up to 60% while maintaining the corpora's utility for training machine learning algorithms.[100.1] The use of pre-trained LLMs for data augmentation and generation also provides a viable alternative to traditional manual annotation, which is often resource-intensive.[101.1] In recent years, the creation of extensive and varied corpora has been pivotal to advances in artificial intelligence and NLP, which in turn have enriched linguistic research. This progress has helped drive paradigm shifts across various subfields of applied linguistics, often referred to as the 'corpus revolution' in language teaching and learning contexts.[108.1] The ongoing expansion of corpus sizes and the development of new computational tools continue to extend the scope and depth of linguistic inquiry, underscoring the critical role of empirical data in advancing our understanding of language.[109.1]

Accessibility for Language Teachers

Recent developments in corpus linguistics have led to an initiative dedicated to the application of corpus-informed materials in the language classroom, which aims to help educators incorporate corpus linguistics into their teaching practices. By adopting this approach, educators can better engage students and improve their understanding of language in context.[123.1] The concept of corpus-based language pedagogy (CBLP) has also emerged, defined as the ability to integrate corpus linguistics technology into pedagogy.[124.1] This integration allows teachers to expose learners to authentic language, which has been shown to enhance learning experiences.[125.1] The use of corpora also enables learners to study at their own pace.[125.1] Strategies have also been developed to assist teachers new to corpus-informed teaching. These strategies aim to build confidence among educators and facilitate their engagement with corpus resources, thereby enhancing students' understanding of language in context.[122.1] Despite this progress, it remains unclear what essential knowledge teachers need in order to use corpora effectively in their instruction.[106.1] Overall, these advancements reflect a growing recognition of the value of corpus linguistics in language education and of the importance of equipping teachers with the tools and knowledge to use these resources successfully.

Applications In Language Teaching

Direct Applications

The application of corpus linguistics in language teaching took shape in the late 1980s and early 1990s, building on early contributions from researchers such as Higgins and Johns (1984).[139.1] This period is often referred to as the "corpus revolution," which has led to a series of paradigm shifts in various subfields of applied linguistics, including language teaching and learning.[131.1] A notable outcome is the emergence of corpus-based language pedagogy (CBLP), defined as "the ability to integrate corpus linguistics technology into classroom language pedagogy".[132.1] This approach emphasizes the use of corpus data to enhance language instruction, providing a more evidence-based framework for teaching methodologies.[132.1]

The use of corpora in language education offers a promising avenue for revolutionizing the way languages are taught and learned. By analyzing authentic texts, educators can gain insights into the nature, structure, and usage of languages, which can inform teaching strategies and materials.[148.1] Despite this potential, the implementation of corpus tools in pedagogical contexts remains limited.[134.1] Recent research has begun to address this gap by reviewing pedagogical applications of corpora and exploring indirect uses, such as in syllabus and materials design.[134.1]

The integration of technology, particularly artificial intelligence (AI), into corpus linguistics is anticipated to further enhance language teaching practices. As educational institutions increasingly adopt AI tools, there is growing interest in how these technologies can work alongside corpus-based methodologies to improve language learning outcomes.[146.1] Future research is expected to focus on various tasks, including the development of corpus research to inform national language curricula and the exploration of AI's role in language education.[147.1]

Indirect Applications

The development of corpus-based language pedagogy (CBLP) has led to significant indirect applications in language teaching, particularly through the integration of corpus technology in classroom settings. Over the past four decades, methodologies in corpus linguistics have undergone paradigm shifts, contributing to what is referred to as the 'corpus revolution' in applied linguistics.[136.1] This evolution has made it possible to distinguish corpus literacy from corpus-based language pedagogy, and it has highlighted the effectiveness of structured teacher training programs that use corpus resources.[137.1] Research indicates that corpus-driven approaches can enhance the learning experience for ESL/EFL beginners by exposing them to authentic language data, which is crucial for their understanding of linguistic structures.[138.1] Such exposure not only aids vocabulary acquisition but also helps learners produce language that is more native-like.[138.1] The application of various analytical approaches, including corpus-based analysis, has also given educators tools to better understand and address the language needs of their students.[138.1]

Methodological Approaches

Quantitative vs. Qualitative Analyses

Quantitative and qualitative analyses are two distinct yet complementary methodological approaches within corpus linguistics. Corpus linguistics is defined as the study of language data on a large scale, specifically the computer-aided analysis of extensive collections of transcribed utterances or written texts.[176.1] A corpus is a collection of examples of language in use that are selected and compiled in a principled manner, allowing researchers to explore and describe linguistic characteristics and patterns associated with language use in various contexts, such as informal conversations, formal speeches, and written communications.[177.1] By using these large, principled collections of naturally occurring language, corpus linguistics supports the accurate examination of how language varies across different speakers and regions.[177.1]

Qualitative corpus analysis is a methodology that focuses on in-depth investigations of linguistic phenomena, grounded in the context of authentic communicative situations that are digitally represented.[201.1] Researchers employing this approach adopt an exploratory, inductive stance, which allows for an empirically based study of how the meanings and functions of linguistic forms interact with various ecological characteristics of language used for communication.[202.1] The methodological foundations of qualitative corpus analysis include deliberate and specialized corpus markup, annotation, and retrieval, which are essential for thorough and grounded investigations.[203.1]

Integrating qualitative methods, such as interviews, with quantitative approaches in corpus linguistics can significantly enhance our understanding of language in social contexts. This combination is particularly relevant in the 21st century, when the conditions for scientific research increasingly require quantitative approaches to data analysis.[200.1] By employing both methodologies, researchers can draw on the strengths of corpus linguistics and gain a more nuanced view of language use and its complexities.

Corpus Design and Construction

The design and construction of corpora are critical to ensuring that the data collected is representative and suitable for the research questions posed. Representativeness is a key requirement, as it ensures that the corpus adequately reflects the language variety being studied. This involves balancing the distribution of text types and genres to avoid overrepresentation of certain categories, as well as determining the appropriate corpus size based on the research question and available resources.[194.1] Randi Reppen (2010) emphasizes that the question of corpus size is typically resolved by two factors: representativeness and practicality, which includes considerations of time constraints.[195.1] As the compilation and study of language corpora become increasingly sophisticated, researchers face challenges related to data selection and interpretation. Continued attention is needed to address these challenges, particularly in ensuring that the corpus is representative of the language variety being studied.[193.1] Metadata is also crucial, as it aids in evaluating the representativeness of an existing corpus, which is essential for the validity of linguistic analyses.[192.1]

Advances in technology, particularly the integration of artificial intelligence (AI) and machine learning, are transforming traditional corpus linguistics methodologies. These methods are designed to better serve the tasks required by corpus linguistics, enabling researchers to analyze complex and 'noisy' data, such as that found in social media corpora.[189.1] The application of large language models (LLMs) is particularly noteworthy, as they provide a deep and nuanced view of language use across various domains and registers.[190.1] Over the past 40 years, the development of methods in corpus linguistics has led to significant paradigm shifts in many subfields of applied linguistics, marking what is often referred to as the 'corpus revolution'.[108.1] Overall, the integration of AI techniques holds promise for new insights into language acquisition and analysis.[191.1]
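The idea of balancing text types during corpus compilation can be illustrated with a simple stratified sample. The sketch below is one possible approach under stated assumptions, not a prescribed procedure: the function name `stratified_sample`, the category labels, the target proportions, and the placeholder text identifiers are all hypothetical.

```python
import random


def stratified_sample(texts_by_category, proportions, total, seed=0):
    """Draw texts so that each category approximates its target share of the corpus.

    texts_by_category: dict mapping a category name to a list of candidate texts
    proportions: dict mapping a category name to its target share (shares sum to 1)
    total: intended number of texts in the finished corpus
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    sample = {}
    for category, share in proportions.items():
        pool = texts_by_category.get(category, [])
        # Never request more texts than the pool actually contains.
        k = min(len(pool), round(total * share))
        sample[category] = rng.sample(pool, k)
    return sample


if __name__ == "__main__":
    # Hypothetical pools of text identifiers and illustrative target shares.
    pools = {
        "spoken": [f"spoken_{i}" for i in range(50)],
        "written": [f"written_{i}" for i in range(50)],
    }
    corpus = stratified_sample(pools, {"spoken": 0.1, "written": 0.9}, total=40)
    print({category: len(texts) for category, texts in corpus.items()})
```

Here the shares are set by number of texts. In practice, balance is often defined by word counts rather than text counts, and the target proportions themselves follow from the research question and the domain the corpus is meant to represent.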

Challenges And Future Directions

Issues of Representativeness in Corpora

The representativeness of linguistic corpora is a critical issue in corpus linguistics, as it directly affects the validity and applicability of research findings. A corpus is defined as "a finite-size body of machine-readable text, sampled in order to be maximally representative of the language variety under consideration".[234.1] This definition underscores the importance of careful sampling and representativeness in corpus compilation. Challenges persist in achieving these goals, as the field grapples with both longstanding and emerging issues in data collection and analysis. One significant challenge is the growing size of corpora and the expanding possibilities for data gathering, which can complicate the representativeness of the samples collected. While some old issues remain, new problems have also emerged in the compilation and study of language corpora.[247.1] This calls for ongoing discussion of the methodologies employed in corpus linguistics, especially in relation to sociolinguistic studies and the representation of diverse language varieties.

The intersection of corpus linguistics with sociolinguistic studies, particularly in the context of language and sexuality, has been explored through various methodologies. An overview of previous corpus linguistic work highlights the compatibility of corpus linguistic methodology with queer linguistics, which serves as a central theoretical approach in language and sexuality studies.[217.1] The methodology has been applied to examine language usage and representation, as demonstrated in a study focusing on changing press discourses surrounding a gay professional football player, Justin Fashanu.[218.1] While many methods in linguistic studies are confined to specific areas, corpus linguistics offers a versatile approach that can be used across all three core areas of linguistic studies: linguistic structures and their usage, sociolinguistic studies, and discursive studies on gender.[219.1]

Addressing these issues of representativeness is essential for enhancing the usability of corpus data. Scholars must continue to refine their approaches to corpus compilation and annotation, ensuring that the resulting datasets are not only comprehensive but also reflective of the diversity present in the target populations.

Collaborative Efforts in Research and Development

Collaborative efforts in research and development are increasingly recognized as vital for advancing corpus linguistics. Proposed future directions highlight the potential of participatory approaches, which interrogate the universality and global relevance of existing research on corpus linguistics and language teaching.[212.1] This agenda identifies specific research trajectories for the corpus revolution and proposes five distinct research tasks aimed at exploring and enhancing the application of corpus linguistics.[213.1] The integration of theory into language, gender, and sexuality research is also highlighted as a significant area for development, one that can be enriched through collaborative methodologies that consider diverse identities.[220.1]

In language education, the potential of data-driven learning (DDL) as a pedagogical tool is acknowledged, although challenges remain that may limit its effectiveness. Strategies such as tailored tasks, auxiliary guidance, and peer learning have been identified as ways to facilitate meaningful corpus engagement, particularly for lower-proficiency students.[222.1] Despite the enthusiasm for incorporating corpus-based resources into language teaching, significant challenges persist. Many educators express a keen interest in using corpus linguistics but face obstacles related to technology use and the design of corpus-based materials.[241.1] The lack of adequate training in both pre-service and in-service contexts has also hindered the integration of corpus methodologies into classroom practice.[244.1] Nevertheless, the benefits of corpus-based approaches are increasingly recognized, and these approaches have begun to gain traction among English teachers. Research indicates that they can significantly enhance both teaching and learning, suggesting a positive trajectory for future collaborative efforts in the field.[246.1]

References

https://en.wikipedia.org/wiki/Corpus_linguistics

[1] Corpus linguistics - Wikipedia. Corpus linguistics is an empirical method for the study of language by way of a text corpus (plural corpora). Corpora are balanced, often stratified collections of authentic, "real world", text of speech or writing that aim to represent a given linguistic variety. Today, corpora are generally machine-readable data collections. Other corpora represent many languages, varieties and modes, and include the International Corpus of English, and the British National Corpus, a 100 million word collection of a range of spoken and written texts, created in the 1990s by a consortium of publishers, universities (Oxford and Lancaster) and the British Library.

https://www.english-linguistics.uni-mainz.de/corpus-linguistics/

[2] Corpus Linguistics | ENGLISH LINGUISTICS Corpus linguistics is a methodology that involves computer-based empirical analyses (both quantitative and qualitative) of language use by employing large, electronically available collections of naturally occurring spoken and written texts, so-called corpora.

https://www.thoughtco.com/what-is-corpus-linguistics-1689936

[3] Definition and Examples of Corpus Linguistics - ThoughtCo. Corpus linguistics is the study of language based on large collections of "real life" language use stored in corpora (or corpuses), computerized databases created for linguistic research. "Definition and Examples of Corpus Linguistics." ThoughtCo, Jun. 25, 2024, thoughtco.com/what-is-corpus-linguistics-1689936.

https://www.sciencedirect.com/topics/social-sciences/corpus-linguistics

[4] Corpus Linguistics - an overview | ScienceDirect Topics. A corpus is defined as a collection of examples of language in use that are selected and compiled in a principled way and corpus linguistics as linguistic studies of such corpora. By using large, principled collections of naturally occurring language, corpus linguistics can accurately explore and describe linguistic characteristics and patterns associated with language use in different contexts (e.g., talking among friends, giving a formal speech, writing a friend, writing a research paper), across different speakers, and how language varies regionally.

https://www.researchgate.net/publication/366026172_The_History_of_Corpus_Linguistics

[5] The History of Corpus Linguistics - ResearchGate Corpus linguistics is a research method which draws on authentic language examples, collected and organized into 'corpora', or searchable 'bodies' of data. The method was established in the 1960s

https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1467-971X.2008.00532.x

[6] Language variation and corpus linguistics - KACHRU - 2008 - World ... ABSTRACT: Corpus linguistics deserves serious attention from linguists and applied linguists, since it is of direct relevance to linguistic description, language variation, lexicography, and language education. Linguists tend to be indifferent to corpora, however, as the predominant paradigm in linguistics is based on introspective data, i.e. native speaker intuition.

https://www.sciencedirect.com/topics/social-sciences/corpus-linguistics

[7] Corpus Linguistics - an overview | ScienceDirect Topics. A corpus is defined as a collection of examples of language in use that are selected and compiled in a principled way and corpus linguistics as linguistic studies of such corpora. By using large, principled collections of naturally occurring language, corpus linguistics can accurately explore and describe linguistic characteristics and patterns associated with language use in different contexts (e.g., talking among friends, giving a formal speech, writing a friend, writing a research paper), across different speakers, and how language varies regionally.

https://slaap.chass.ncsu.edu/pdfs/Kendall2011_BJAL_CorpSocioling.pdf

[8] PDF There are also growing connections between sociolinguistics and corpus linguistics in terms of specific research. For instance, Torgersen, Gabrielatos, Hoffmann, and Fox (2011) provide an excellent example of how corpora and corpus linguistic methodologies can be used to pursue core sociolinguistic

https://www.open-access.bcu.ac.uk/12268/1/How+to+use+corpus+linguistics+in+sociolinguistics.pdf

[9] PDF Using Corpora in Sociolinguistic Research Framing sociolinguistics within corpus research In this chapter we consider how corpora can be used in order to carry out research from a sociolinguistic perspective. Sociolinguistics is a somewhat broad term, with Labov (1972: 183) indicating that it can appear redundant as all language is social.

https://www.sciencedirect.com/science/article/pii/S2405844023099395

[14] Language corpus and data driven learning (DDL) in language classrooms ... The use of corpora in language teaching and learning presents a promising prospect for revolutionising the way languages are taught and learned. In linguistic research, corpus linguistics involves the gathering and analysis of collections of authentic texts to provide evidence for describing the nature, structure, and use of languages.

https://www.cambridge.org/core/books/designing-and-evaluating-language-corpora/corpus-design-and-representativeness-in-practice-with-daniel-keller/B584F94DD402ACE12B1B9384271B7B52

[15] Corpus Design and Representativeness in Practice - With Daniel Keller ... We propose that the representativeness of a corpus directly depends on its suitability for a specific research goal (including the domain and the linguistic feature (s) of interest). Creating a new corpus involves establishing linguistic research question (s), addressing domain considerations, including describing the domain, operationalizing the domain, evaluating the operational domain


https://medium.com/@riazleghari/building-a-corpus-4bd02f69f5c5

[16] Building a Corpus. How to Build a Corpus? | by Riaz Laghari | Medium You can build a corpus that provides valuable linguistic insights by following these steps — defining the scope, collecting and annotating data, and ensuring balance and representativeness.


https://www.lancaster.ac.uk/fass/projects/corpus/ZJU/xCBLS/chapters/A02.pdf

[19] PDF 2.2 What does representativeness mean in corpus linguistics? What does representativeness mean in corpus linguistics? According to Leech (1991: 27), a corpus is thought to be representative of the language variety it is supposed to represent if the findings based on its contents can be generalized to the said language variety. Biber (1993: 243) defines representativeness from the viewpoint of


https://www.lancaster.ac.uk/fass/projects/corpus/ZJU/xCBLS/chapters/B01.pdf

[20] PDF Representativeness refers to the extent to which a sample includes the full range of variability in a population. In corpus design, variability can be considered from situational and from linguistic perspectives, and both of these are important in determining representativeness.


https://www.sciencedirect.com/science/article/pii/S2666799123000278

[21] Hack your corpus analysis: How AI can assist corpus linguists deal with ... Applied Corpus Linguistics. Volume 3, Issue 3, December 2023, ... This short reflection considers the role of Artificial Intelligence (AI) language models in supporting the work of corpus linguists. ... significant manual effort or specialised expertise on the part of the corpus linguist. AI, such as language models and machine learning


https://www.sciencedirect.com/science/article/pii/S2666799123000424

[22] Generative AI for corpus approaches to discourse studies: A critical ... Understood as a form of intelligence that is both similar to and distinct from human intelligence (Korteling et al., 2021), artificial intelligence (AI) and its subfields of machine learning and natural language processing (Shoenbill et al., 2023) play an important role in contemporary linguistics research. AI is argued to emulate human cognition (Shneiderman, 2020), a capacity which has given


https://www.sciencedirect.com/science/article/pii/S2666799123000424

[23] Generative AI for corpus approaches to discourse studies: A critical ... Subsuming both natural language processing and machine learning, AI has been used to develop software for conducting corpus analysis, such as AntConc (Anthony, 2023), to enhance approaches to multimodal corpus development (Zhou and Gao, 2023), and to inform tools that make use of corpus linguistics for applied purposes (e.g., corpus-informed


https://fis.uni-bamberg.de/bitstreams/0bcc1348-3714-4bde-8d4c-0775b3c1213b/download

[24] Introduction : Comparative Approaches to Data and Methods in Corpus ... The final part takes the examination of data and methods to regions where corpus linguistics meets computational linguistics and machine learning, two overlapping disciplines that have potential for supplementing and advancing linguistic research.


https://www.sciencedirect.com/topics/social-sciences/corpus-linguistics

[48] Corpus Linguistics - an overview | ScienceDirect Topics A corpus is defined as a collection of examples of language in use that are selected and compiled in a principled way and corpus linguistics as linguistic studies of such corpora. By using large, principled collections of naturally occurring language, corpus linguistics can accurately explore and describe linguistic characteristics and patterns associated with language use in different contexts (e.g., talking among friends, giving a formal speech, writing a friend, writing a research paper), across different speakers, and how language varies regionally.


https://academic.oup.com/edited-volume/28195/chapter/213161114

[51] The History of Corpus Linguistics - Oxford Academic The modern field of corpus linguistics - based around the computer-aided analysis of extremely large databases of text - is largely a phenomenon ... context that techniques such as corpus annotation, and important concepts such as collocation, emerged. Alongside this history of corpus linguistics considered as a methodology stands the


https://lancsbox.lancs.ac.uk/history/

[52] History of corpus linguistics - Lancaster University Corpus linguistics milestones. Mechanical calculating machines. Karl Pearson used Brunsviga for his calculations. He also laid the groundwork for statistical analyses which are still widely used in many disciplines including corpus linguistics. 1900s. Brown corpus on punch cards.


https://pmc.ncbi.nlm.nih.gov/articles/PMC11544590/

[53] Corpus linguistics meets historical linguistics and construction ... The early 2000s also saw the development of collostructional analysis (Gries and Stefanowitsch 2004; Stefanowitsch and Gries 2003), which applied the corpus-linguistic concept of collocational analysis to the study of associations between constructions and lexical elements, thereby building a conceptual bridge from corpus linguistics to


https://www.semanticscholar.org/paper/Karl-Pearson-and-the-Logic-of-Science:-Renouncing-Stern/a18157e8869a25c47eb4a179c1a207b8b0f071d3

[55] [PDF] Karl Pearson and the Logic of Science: Renouncing Causal ... Karl Pearson is the leading figure of XX century statistics. He and his co-workers crafted the core of the theory, methods and language of frequentist or classical statistics -- the prevalent inductive logic of contemporary science. However, before working in statistics, K.Pearson had other interests in life, namely, in this order, philosophy, physics, and biological heredity. Key concepts of


https://ieeexplore.ieee.org/document/10866415

[59] Bridging Linguistics and Machine Learning: A New Human-Machine ... This research paper explores the evolution of Natural Language Processing (NLP) and Computational Linguistics, tracing the transition from grammar-based approaches to the adoption of machine learning methods. Initially rooted in linguistic theories, early NLP systems relied on rule-based methods for language parsing and processing, albeit with limitations in handling natural language


https://arxiv.org/html/2503.20227

[60] Advancements in Natural Language Processing: Exploring Transformer ... This paper is aiming to bridge the gap between human communication and machine understanding. Early systems relied on rule-based methods and statistical models, which often struggled with the nuanced and complex nature of natural language. The introduction of deep learning, particularly transformer-based architectures, marked a pivotal shift.


https://www.taylorfrancis.com/books/edit/10.4324/9781315724812/triangulating-methodological-approaches-corpus-linguistic-research-paul-baker-jesse-egbert

[65] Triangulating Methodological Approaches in Corpus Linguistic Research Contemporary corpus linguists use a wide variety of methods to study discourse patterns. This volume provides a systematic comparison of various methodological approaches in corpus linguistics through a series of parallel empirical studies that use a single corpus dataset to answer the same overarching research question.


https://www.cambridge.org/core/books/data-and-methods-in-corpus-linguistics/7EE2A7E26A8399A2DD386970235ED436

[66] Data and Methods in Corpus Linguistics - Cambridge University Press ... Corpus linguistics continues to be a vibrant methodology applied across highly diverse fields of research in the language sciences. With the current steep rise in corpus sizes, computational power, statistical literacy and multi-purpose software tools, and inspired by neighbouring disciplines, approaches have diversified to an extent that calls for an intensification of the accompanying


https://www.sciencedirect.com/science/article/pii/S2772766124000478

[67] Epistemologies of corpus linguistics across disciplines Recognising the potential for such an outcome, this paper seeks to bring to the fore the underlying methodological tensions found in the use of corpus linguistics in what Hunston (2022) has designated as outward-facing corpus studies. That is, the application of corpus methods in research that lies outside the interest of major linguistic disciplines.


https://www.cambridge.org/core/journals/language-teaching/article/corpus-linguistics-for-language-teaching-and-learning-a-research-agenda/CABC78CDD77B7799BE293F7F66A48352

[68] Corpus linguistics for language teaching and learning: A research ... Over the last 40 years, the development of methods in corpus linguistics and theories based on corpus data has brought about a series of paradigm shifts in many subfields of applied linguistics and beyond (Hunston, 2022). In the context of language teaching and learning, this development was seen to be a part of the so-called 'corpus revolution' (Rundell


https://www.sciencedirect.com/science/article/pii/S2666799123000278

[71] Hack your corpus analysis: How AI can assist corpus linguists deal with messy social media data - ScienceDirect This short reflection considers the role of Artificial Intelligence (AI) language models in supporting the work of corpus linguists. The focus is on how AI can be used in the analysis of complex and 'noisy' data commonly found in social media corpora. The main objective is to guide the reader through the author's reasoning process as a way of reflecting on the integration of AI technology into the academic practice of corpus linguists.


https://corpus-analysis.com/

[72] Tools for Corpus Linguistics. ANNIS: search and visualization tool for multi-layer linguistic corpora with diverse types of annotation (search, visualization; Web, or Linux, Mac, Windows; free). AntConc: corpus analysis toolkit (wordlists, concordancer, keywords; Linux, Mac, Windows; free). Coquery: a free corpus query tool to search, analyze, and visualize corpora (query, visualization; Linux, Mac, Windows; free). Dexter: tool for text annotation (annotation; Linux, Mac, Windows; free). ProtAnt: tool for prototypical text analysis (wordlists; Windows, Mac; free). TreeTagger: tool for annotating text with part-of-speech and lemma information (pos tagger, annotation; Windows, Mac, Linux; free).


https://www.cambridge.org/core/elements/natural-language-processing-for-corpus-linguistics/1063EED446D505D33E0FAB43BDA98DF5

[98] Natural Language Processing for Corpus Linguistics Corpus analysis can be expanded and scaled up by incorporating computational methods from natural language processing. This Element shows how text classification and text similarity models can extend our ability to undertake corpus linguistics across very large corpora.


https://aclanthology.org/2021.ranlp-1.26/

[100] Active Learning for Assisted Corpus Construction: A Case Study in ... Overall, our preliminary experiments suggest that as much as 60% of the annotation time could be saved while producing corpora that have the same usefulness for training machine learning algorithms. An open-source computational tool that implements the aforementioned strategies is presented and published online for the research community.


https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11258409/

[101] Annotation-preserving machine translation of English corpora to ... Creating an annotated clinical corpus in any language is resource-intensive, requiring significant labor to manually annotate numerous clinical texts in great detail. The use of pre-trained LLMs for data augmentation and generation to create new annotated corpora has been proposed as an alternative to the manual annotation effort.


https://www.tandfonline.com/doi/full/10.1080/09588221.2022.2040537

[106] Full article: Teacher paths for developing corpus-based language ... 2.2. CBLP: a corpus-based TPACK. Although increasing efforts have been made in recent years to offer corpus training for both pre- and in-service teachers, Ma et al. (2021) argued that it remains unclear as to what knowledge requires developing to teach with corpora. To understand the essential knowledge needed in teacher training, they proposed and tested a two-dimension teacher


https://www.cambridge.org/core/journals/language-teaching/article/corpus-linguistics-for-language-teaching-and-learning-a-research-agenda/CABC78CDD77B7799BE293F7F66A48352

[108] Corpus linguistics for language teaching and learning: A research ... Over the last 40 years, the development of methods in corpus linguistics and theories based on corpus data has brought about a series of paradigm shifts in many subfields of applied linguistics and beyond (Hunston, 2022). In the context of language teaching and learning, this development was seen to be a part of the so-called 'corpus revolution' (Rundell


https://medium.com/@NameIsNavin/tracing-the-evolution-of-corpus-data-from-historical-foundations-to-future-frontiers-5dcdb7f0bf59

[109] Tracing the Evolution of Corpus Data: From Historical ... - Medium In addition to improving linguistic research, the creation of extensive and varied corpora has been crucial to advancements in artificial intelligence and Natural Language Processing.


https://assets.cambridge.org/97805211/46081/frontmatter/9780521146081_frontmatter.pdf

[122] PDF Professional Development for Language Teachers: Strategies for ... Classroom together with its companion Web site will enable teachers new to corpus-informed teaching to overcome possible inhibitions about the ... Corpus linguistics allows teachers and learners to be confident that they are learning the language


https://www.researchgate.net/publication/389395005_Corpus_linguistics_for_language_teaching_and_learning_A_research_agenda

[123] (PDF) Corpus linguistics for language teaching and ... - ResearchGate corpus linguistics by working with them to develop an open education resource dedicated to the application of corpus-informed materials in the language classroom. Adopting this approach may be a


https://onlinelibrary.wiley.com/doi/full/10.1002/tesq.3281

[124] Corpora in English Language Teacher Education: Research, Integration ... A second stream in the literature relates to CL integration with a focus on how it can be used for teaching rather than learning. This has been called corpus-based language pedagogy (CBLP), defined as "the ability to integrate corpus linguistics technology into classroom language pedagogy" (Ma, Tang, & Lin, 2021, p. 2734).


https://www.sciencedirect.com/science/article/pii/S2666799123000084

[125] Becoming corpus literate: An in-service EFL teacher education framework ... For many years, applied linguists have highlighted the benefits of using corpora in the language learning classroom. These benefits include the exposure of learners to authentic language (O'Keeffe et al., 2007) and the promotion of autonomous learning through the use of corpora, enabling learners to learn at a time and pace that suits them best (Boulton, 2010b).


https://www.cambridge.org/core/journals/language-teaching/article/corpus-linguistics-for-language-teaching-and-learning-a-research-agenda/CABC78CDD77B7799BE293F7F66A48352

[131] Corpus linguistics for language teaching and learning: A research ... Over the last 40 years, the development of methods in corpus linguistics and theories based on corpus data has brought about a series of paradigm shifts in many subfields of applied linguistics and beyond (Hunston, 2022). In the context of language teaching and learning, this development was seen to be a part of the so-called 'corpus revolution' (Rundell


https://onlinelibrary.wiley.com/doi/full/10.1002/tesq.3281

[132] Corpora in English Language Teacher Education: Research, Integration ... A second stream in the literature relates to CL integration with a focus on how it can be used for teaching rather than learning. This has been called corpus-based language pedagogy (CBLP), defined as "the ability to integrate corpus linguistics technology into classroom language pedagogy" (Ma, Tang, & Lin, 2021, p. 2734).


https://www.cambridge.org/core/journals/annual-review-of-applied-linguistics/article/abs/corpus-research-applications-in-second-language-teaching/020AD888ADB320561C8FABA89B32E753

[134] Corpus Research Applications in Second Language Teaching Still, corpora and corpus tools have yet to be widely implemented in pedagogical contexts. The aim of this article is to provide an overview of pedagogical corpus applications and to review recent publications in the area of corpus linguistics and language teaching. It covers indirect corpus applications, such as in syllabus or materials design


https://www.cambridge.org/core/journals/language-teaching/article/corpus-linguistics-for-language-teaching-and-learning-a-research-agenda/CABC78CDD77B7799BE293F7F66A48352

[136] Corpus linguistics for language teaching and learning: A research ... Over the last 40 years, the development of methods in corpus linguistics and theories based on corpus data has brought about a series of paradigm shifts in many subfields of applied linguistics and beyond (Hunston, 2022). In the context of language teaching and learning, this development was seen to be a part of the so-called 'corpus revolution' (Rundell


https://www.tandfonline.com/doi/full/10.1080/09588221.2021.1895225

[137] The development of corpus-based language pedagogy for TESOL teachers: a ... The results support a differentiation between corpus literacy and corpus-based language pedagogy, attesting to the effectiveness of the two-step corpus-based teacher training. The study provides several insights regarding how to scaffold teachers in corpus-based training and teach students with corpus resources to address their vocabulary needs


https://files.eric.ed.gov/fulltext/EJ1312960.pdf

[138] PDF Arab World English Journal, September 2018, www.awej.org, ISSN: 2229-9327. However, it is worth noting that researchers in the field of linguistics have distinctively devised different perspectives concerning the issues of learners' language and deployed several analytical approaches to analyse the samples of data collected from L2 learners, such as Contrastive Analysis (CA); Error Analysis (EA); obligatory occasion analysis; frequency analysis; functional analysis; computer-based analysis (such as corpus-based analysis) and a host of others. Corpus as a guided teaching of language from studying the linguistics pattern in more general passion rather than units, this will help ESL/EFL beginners to enhance their knowledge of structure of a particular language as one of their basic needs as beginners is to interact with natural occurring language data (Krashen’s (1981) second language acquisition theory), this can be provided through corpus-driven approach, “exposing to authentic English and producing native-like English through corpus are of significance for many EFL students as beginners or intermediate ones” (Proctor, 2012, p.


https://www.researchgate.net/publication/291832144_Corpora_in_Language_Teaching

[139] (PDF) Corpora in Language Teaching - ResearchGate Applications of corpus linguistics to language teaching began in the late eighties and early nineties. Examples of early work are Higgins and Johns (1984), Higgins ... Textual patterns: Key words


https://jlt.ac/home/article/view/102

[146] The linguistic leap: Understanding, evaluating, and integrating AI in ... The landscape of language education is undergoing a pivotal transformation, spurred by the integration of Generative Artificial Intelligence (Gen-AI) into every facet of traditional and new methodologies and practices. Given the rapid societal adoption of AI, we believe that all language instructors - from the most technologically savvy to the most tech-averse - must engage critically and


https://www.cambridge.org/core/journals/language-teaching/article/corpus-linguistics-for-language-teaching-and-learning-a-research-agenda/CABC78CDD77B7799BE293F7F66A48352

[147] Corpus linguistics for language teaching and learning: A research ... This agenda identifies future research trajectories for the corpus revolution, proposing five specific research tasks designed to explore and advance the application of corpus linguistics in language education. These tasks focus on: (1) contrastive data-driven learning, (2) the development of corpus research for informing national language curricula, (3) the use of artificial intelligence for


https://www.sciencedirect.com/science/article/pii/S2405844023099395

[148] Language corpus and data driven learning (DDL) in language classrooms ... The use of corpora in language teaching and learning presents a promising prospect for revolutionising the way languages are taught and learned. In linguistic research, corpus linguistics involves the gathering and analysis of collections of authentic texts to provide evidence for describing the nature, structure, and use of languages.


https://www.cambridge.org/core/books/data-and-methods-in-corpus-linguistics/comparative-approaches-to-data-and-methods-in-corpus-linguistics/CE338DCD0485A7A6E54C8BAF0D5AB459

[176] Comparative Approaches to Data and Methods in Corpus Linguistics ... In the Introduction, the editors describe the motivations and aims underlying the publication of the book against the background of the importance of corpus linguistics in current research and the associated methodological diversification. The didactic orientation of the book is outlined, as is its organization into four major parts and the contribution made by each chapter. Going beyond a


https://www.researchgate.net/publication/341313168_Corpus_Linguistics_A_Guide_to_the_Methodology

[177] Corpus Linguistics: A Guide to the Methodology | Request PDF The first part introduces the reader to the general methodological discussions surrounding corpus data as well as the practice of doing corpus linguistics, including issues such as the scientific


https://www.sciencedirect.com/science/article/pii/S2666799123000278

[189] Hack your corpus analysis: How AI can assist corpus linguists deal with messy social media data - ScienceDirect This short reflection considers the role of Artificial Intelligence (AI) language models in supporting the work of corpus linguists. The focus is on how AI can be used in the analysis of complex and 'noisy' data commonly found in social media corpora. The main objective is to guide the reader through the author's reasoning process as a way of reflecting on the integration of AI technology into the academic practice of corpus linguists.


https://www.researchgate.net/publication/373819543_Corpus_AI_Integrating_Large_Language_Models_LLMs_into_a_Corpus_Analysis_Toolkit

[190] Corpus AI: Integrating Large Language Models (LLMs) into a Corpus ... Large Language Models (LLMs) have the potential to play a pivotal role in corpus linguistics research, providing a deep and nuanced view of language use in a variety of domains and registers.


https://www.researchgate.net/publication/379239839_Artificial_Intelligence_in_Linguistics_Research_Applications_in_Language_Acquisition_and_Analysis

[191] Artificial Intelligence in Linguistics Research: Applications in ... Overall, the integration of AI techniques holds immense promise for unlocking new insights into language acquisition and analysis, paving the way for future advancements in the field of linguistics.


https://www.sciencedirect.com/science/article/pii/S277276612400051X

[192] Representativeness and metadata presentation in learner/child corpora ... Representativeness is a key requirement in corpus linguistics, and the evaluation of the representativeness of an existing corpus depends on the provision of metadata. The present paper discusses challenges to both representativeness and metadata presentation based on our experiences in compiling corpora of school writing from young learners.


https://benjamins.com/catalog/scl.118

[193] Challenges in Corpus Linguistics: Rethinking corpus compilation and ... As the compilation and study of language corpora gets increasingly sophisticated and complex, continuous attention on ways of dealing with the data in question and challenges in text selection and interpretation is needed. The contributions to this volume address problems relating to a variety of areas in corpus linguistic study, including


https://libguides.usc.edu/c.php?g=1443977&p=10726957

[194] Corpora and Text/Data Mining For Digital Humanities Projects Representativeness: Ensure your corpus adequately reflects the language variety you are studying. Balance: Distribute text types and genres proportionally to avoid overrepresentation of certain categories. Size: Determine the appropriate corpus size depending on your research question and available resources.


https://wac.colostate.edu/docs/books/scale/chapter4.pdf

[195] PDF on the ideal size of a corpus, Randi Reppen (2010) wrote that "for most questions that are pursued by corpus researchers, the question of size is resolved by two factors: representativeness (have I collected enough texts (words) to accurately represent the type of language under investigation?) and practicality (time constraints)" (p. 32).


https://www.sciencedirect.com/science/article/pii/S2772766124000478

[200] Epistemologies of corpus linguistics across disciplines The combination of corpus approaches with qualitative methods widely used in education, such as interviews or ethnography, ... the necessary conditions for scientific research in the 21st century require the use quantitative approaches to data analysis. In this context, corpus linguistics methods can be attractive to social sciences and


https://www.researchgate.net/publication/277690305_Qualitative_Corpus_Analysis

[201] (PDF) Qualitative Corpus Analysis - ResearchGate Qualitative corpus analysis is a methodology for pursuing in-depth investigations of linguistic phenomena, as grounded in the context of authentic, communicative situations that are digitally


https://onlinelibrary.wiley.com/doi/10.1002/9781405198431.wbeal0974.pub2

[202] Qualitative Corpus Analysis - Hasko - Wiley Online Library Researchers using qualitative corpus analysis as the methodological basis for their investigations adopt an exploratory, inductive approach to the empirically based study of how the meanings and functions of linguistic forms found in a specific corpus interact with diverse ecological characteristics of language used for communication (speaker


https://onlinelibrary.wiley.com/doi/full/10.1002/9781405198431.wbeal0974.pub3

[203] Qualitative Corpus Analysis - HASKO - Wiley Online Library Methodological foundations of conducting qualitative corpus analysis are reviewed, with a particular focus on the deliberate and specialized corpus markup, annotation, retrieval, and interpretation that are required for the in-depth and grounded investigations carried out by linguists interested in qualitative approaches to corpus studies.


https://www.cambridge.org/core/journals/language-teaching/article/corpus-linguistics-for-language-teaching-and-learning-a-research-agenda/CABC78CDD77B7799BE293F7F66A48352

[212] Corpus linguistics for language teaching and learning: A research ... Subsequently, future directions in research are highlighted, identifying the potential for participatory approaches to research design and interrogating the universality and global relevance of research on corpus linguistics and language teaching to-date. Finally, a brief conclusion is offered in Section 4.


https://www.researchgate.net/publication/389395005_Corpus_linguistics_for_language_teaching_and_learning_A_research_agenda

[213] Corpus linguistics for language teaching and learning: A research agenda This agenda identifies future research trajectories for the corpus revolution, proposing five specific research tasks designed to explore and advance the application of corpus linguistics in


https://benjamins.com/catalog/jls.17019.mot

[217] Corpus linguistics in language and sexuality studies As an introduction to the special issue, this paper presents an overview of previous corpus linguistic work in the field of language and sexuality and discusses the compatibility of corpus linguistic methodology with queer linguistics as a central theoretical approach in language and sexuality studies. The discussion is structured around five prototypical aspects of corpus linguistics that may


https://research.aston.ac.uk/en/publications/corpus-linguistics-and-sexuality

[218] Corpus Linguistics and Sexuality - Aston Research Explorer This chapter demonstrates how corpus methods have been applied to research on language and sexuality, enabling both examination of language usage and representation of sexual identities and practices. This is followed by a case study that considers changing press discourses concerning a gay professional football player, Justin Fashanu.


https://www.academia.edu/43922504/Motschenbacher_Heiko_2019_Methods_in_language_gender_and_sexuality_studies_An_overview_Wiener_Slawistischer_Almanach_84_43_79

[219] Motschenbacher, Heiko (2019): "Methods in language, gender and ... While most of the methods identified in this overview are largely restricted to one of the three core areas of LGS studies, corpus linguistics represents a methodology that can be put to productive use in all three research areas: work on linguistic structures and their usage, sociolinguistic studies and discursive studies on gender and

wiley

https://compass.onlinelibrary.wiley.com/doi/10.1111/lnc3.12147

[220] Integrating Intersectionality in Language, Gender, and Sexuality ... In this paper, I argue for the need to integrate intersectionality theory more fully in language, gender, and sexuality research. I outline the basic principles of what an intersectional approach to identity and identity-linked speech entails, focusing particularly on the belief that an adequate description of lived experience, and hence social practice, requires us to consider the ways in

cell

https://www.cell.com/heliyon/pdf/S2405-8440(23

[222] PDF as a pedagogical tool, but challenges exist that limit its positive impact on language learning. Tailored tasks, auxiliary guidance, supplemental support, and peer/group learning were identified as effective strategies for facilitating meaningful corpus engagement for lower-proficiency students. 1. Introduction

sciencedirect

https://www.sciencedirect.com/science/article/pii/S2405844023099395

[234] Language corpus and data driven learning (DDL) in language classrooms ... Scholarly conceptualizations of 'corpus' converge on several core features. McEnery and Wilson (p. 32) define a corpus as "a finite-size body of machine-readable text, sampled in order to be maximally representative of the language variety under consideration". This denotes that a corpus is characterised by sampling, representativeness, finite size, machine readability, and a

ed

https://files.eric.ed.gov/fulltext/EJ1201925.pdf

[241] PDF Nevertheless, the participants stated their concerns regarding the use of corpus-based language pedagogy due to challenges in using technology and designing corpus-based materials. The findings of this study may shed light on teachers' classroom practices of corpus-based language pedagogy in vocabulary instruction.

cambridge

https://www.cambridge.org/core/journals/recall/article/teachers-perceived-corpus-literacy-and-their-intention-to-integrate-corpora-into-classroom-teaching-a-survey-study/C9A34891029CECE6A268770D44E13BEA

[244] Teachers' perceived corpus literacy and their intention to integrate ... While corpus linguistics and corpora use have energised language research in recent decades, few teachers have integrated them into their classroom teaching, partially due to the absence of in- and pre-service teacher training (Boulton, 2017; Breyer, 2009; Callies, 2019

academia

https://www.academia.edu/108678923/Corpora_in_English_Language_Teaching_Classroom_Activities_for_Teachers_New_to_Corpus_Linguistics

[246] (PDF) Corpora in English Language Teaching: Classroom Activities for ... Corpus linguistics has seen an expansion in scope over the past several decades. It has also found its way into language classrooms. Corpus-based approaches have gradually become a common practice among English teachers. Some research studies have established that corpus-based approaches are extremely beneficial to both teacher and learner.

wordery

https://wordery.com/challenges-in-corpus-linguistics-mark-kaunisto-editor-marco-schilk-editor-9789027215888

[247] Challenges in Corpus Linguistics : Rethinking Corpus Compilation and ... This volume contributes to the discussion of challenges faced in different areas of corpus linguistics, namely the compilation, annotation, and analysis of linguistic corpora. In a field of growing corpus sizes and expanding possibilities of gathering data, some old issues persist, while at the same time new problems have emerged. As the compilation and study of language corpora gets